rating prediction
HADSF: Aspect Aware Semantic Control for Explainable Recommendation
Recent advances in large language models (LLMs) promise more effective information extraction for review-based recommender systems, yet current methods still (i) mine free-form reviews without scope control, producing redundant and noisy representations, (ii) lack principled metrics that link LLM hallucination to downstream effectiveness, and (iii) leave the cost-quality trade-off across model scales largely unexplored. We address these gaps with the Hyper-Adaptive Dual-Stage Semantic Framework (HADSF), a two-stage approach that first induces a compact, corpus-level aspect vocabulary via adaptive selection and then performs vocabulary-guided, explicitly constrained extraction of structured aspect-opinion triples. To assess the fidelity of the resulting representations, we introduce Aspect Drift Rate (ADR) and Opinion Fidelity Rate (OFR) and empirically uncover a nonmonotonic relationship between hallucination severity and rating prediction error. Experiments on approximately 3 million reviews across LLMs spanning 1.5B-70B parameters show that, when integrated into standard rating predictors, HADSF yields consistent reductions in prediction error and enables smaller models to achieve competitive performance in representative deployment scenarios. We release code, data pipelines, and metric implementations to support reproducible research on hallucination-aware, LLM-enhanced explainable recommendation. Code is available at https://github.com/niez233/HADSF
Memory Assisted LLM for Personalized Recommendation System
Large language models (LLMs) have demonstrated significant potential in solving recommendation tasks. With proven capabilities in understanding user preferences, LLM personalization has emerged as a critical area for providing tailored responses to individuals. Current studies explore personalization through prompt design and fine-tuning, paving the way for further research in personalized LLMs. However, existing approaches are either costly and inefficient in capturing diverse user preferences or fail to account for timely updates to user history. To address these gaps, we propose the Memory-Assisted Personalized LLM (MAP). Through user interactions, we first create a history profile for each user, capturing their preferences, such as ratings for historical items. During recommendation, we extract relevant memory based on similarity, which is then incorporated into the prompts to enhance personalized recommendations. In our experiments, we define a new task that enables testing with varying memory size under two scenarios: single domain where memory and tasks are from the same category and cross-domain (e.g. memory from movies and recommendation tasks in books). The results show that MAP outperforms regular LLM-based recommenders that integrate user history directly through prompt design. Moreover, as user history grows, MAP's advantage increases in both scenarios, making it more suitable for addressing successive personalized user requests.
ELIXIR: Efficient and LIghtweight model for eXplaIning Recommendations
Kabongo, Ben, Guigue, Vincent, Lemberger, Pirmin
Collaborative filtering drives many successful recommender systems but struggles with fine-grained user-item interactions and explainability. As users increasingly seek transparent recommendations, generating textual explanations through language models has become a critical research area. Existing methods employ either RNNs or Transformers. However, RNN-based approaches fail to leverage the capabilities of pre-trained Transformer models, whereas Transformer-based methods often suffer from suboptimal adaptation and neglect aspect modeling, which is crucial for personalized explanations. We propose ELIXIR (Efficient and LIghtweight model for eXplaIning Recommendations), a multi-task model combining rating prediction with personalized review generation. ELIXIR jointly learns global and aspect-specific representations of users and items, optimizing overall rating, aspect-level ratings, and review generation, with personalized attention to emphasize aspect importance. Based on a T5-small (60M) model, we demonstrate the effectiveness of our aspect-based architecture in guiding text generation in a personalized context, where state-of-the-art approaches exploit much larger models but fail to match user preferences as well. Experimental results on TripAdvisor and RateBeer demonstrate that ELIXIR significantly outperforms strong baseline models, especially in review generation.
RSAttAE: An Information-Aware Attention-based Autoencoder Recommender System
Taromi, Amirhossein Dadashzadeh, Heydari, Sina, Hooshmand, Mohsen, Ramezani, Majid
Recommender systems play a crucial role in modern life, including information retrieval, the pharmaceutical industry, retail, and entertainment. The entertainment sector, in particular, attracts significant attention and generates substantial profits. This work proposes a new method for predicting unknown user-movie ratings to enhance customer satisfaction. To achieve this, we utilize the MovieLens 100K dataset. Our approach introduces an attention-based autoencoder to create meaningful representations and the XGBoost method for rating predictions. The results demonstrate that our proposal outperforms most of the existing state-of-the-art methods. Availability: github.com/ComputationIASBS/RecommSys
Accurate and Data-Efficient Toxicity Prediction when Annotators Disagree
Jaggi, Harbani, Murali, Kashyap, Fleisig, Eve, Bฤฑyฤฑk, Erdem
When annotators disagree, predicting the labels given by individual annotators can capture nuances overlooked by traditional label aggregation. We introduce three approaches to predicting individual annotator ratings on the toxicity of text by incorporating individual annotator-specific information: a neural collaborative filtering (NCF) approach, an in-context learning (ICL) approach, and an intermediate embedding-based architecture. We also study the utility of demographic information for rating prediction. NCF showed limited utility; however, integrating annotator history, demographics, and survey information permits both the embedding-based architecture and ICL to substantially improve prediction accuracy, with the embedding-based architecture outperforming the other methods. We also find that, if demographics are predicted from survey information, using these imputed demographics as features performs comparably to using true demographic data. This suggests that demographics may not provide substantial information for modeling ratings beyond what is captured in survey responses. Our findings raise considerations about the relative utility of different types of annotator information and provide new approaches for modeling annotators in subjective NLP tasks.
Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation
Kim, Jieyong, Kim, Hyunseo, Cho, Hyunjin, Kang, SeongKu, Chang, Buru, Yeo, Jinyoung, Lee, Dongha
Recent advancements in Large Language Models (LLMs) have demonstrated exceptional performance across a wide range of tasks, generating significant interest in their application to recommendation systems. However, existing methods have not fully capitalized on the potential of LLMs, often constrained by limited input information or failing to fully utilize their advanced reasoning capabilities. To address these limitations, we introduce Exp3rt, a novel LLM-based recommender designed to leverage rich preference information contained in user and item reviews. Exp3rt is basically fine-tuned through distillation from a teacher LLM to perform three key tasks in order: Exp3rt first extracts and encapsulates essential subjective preferences from raw reviews, aggregates and summarizes them according to specific criteria to create user and item profiles. It then generates detailed step-by-step reasoning followed by predicted rating, i.e., reasoning-enhanced rating prediction, by considering both subjective and objective information from user/item profiles and item descriptions. This personalized preference reasoning from Exp3rt enhances rating prediction accuracy and also provides faithful and reasonable explanations for recommendation. Extensive experiments show that Exp3rt outperforms existing methods on both rating prediction and candidate item reranking for top-k recommendation, while significantly enhancing the explainability of Figure 1: Comparison between existing LLM-based recommenders recommendation systems.
Leveraging LLM Reasoning Enhances Personalized Recommender Systems
Tsai, Alicia Y., Kraft, Adam, Jin, Long, Cai, Chenwei, Hosseini, Anahita, Xu, Taibai, Zhang, Zemin, Hong, Lichan, Chi, Ed H., Yi, Xinyang
Recent advancements have showcased the potential of Large Language Models (LLMs) in executing reasoning tasks, particularly facilitated by Chain-of-Thought (CoT) prompting. While tasks like arithmetic reasoning involve clear, definitive answers and logical chains of thought, the application of LLM reasoning in recommendation systems (RecSys) presents a distinct challenge. RecSys tasks revolve around subjectivity and personalized preferences, an under-explored domain in utilizing LLMs' reasoning capabilities. Our study explores several aspects to better understand reasoning for RecSys and demonstrate how task quality improves by utilizing LLM reasoning in both zero-shot and finetuning settings. Additionally, we propose RecSAVER (Recommender Systems Automatic Verification and Evaluation of Reasoning) to automatically assess the quality of LLM reasoning responses without the requirement of curated gold references or human raters. We show that our framework aligns with real human judgment on the coherence and faithfulness of reasoning responses. Overall, our work shows that incorporating reasoning into RecSys can improve personalized tasks, paving the way for further advancements in recommender system methodologies.
Privacy-preserving recommender system using the data collaboration analysis for distributed datasets
Yanagi, Tomoya, Ikeda, Shunnosuke, Sukegawa, Noriyoshi, Takano, Yuichi
In order to provide high-quality recommendations for users, it is desirable to share and integrate multiple datasets held by different parties. However, when sharing such distributed datasets, we need to protect personal and confidential information contained in the datasets. To this end, we establish a framework for privacy-preserving recommender systems using the data collaboration analysis of distributed datasets. Numerical experiments with two public rating datasets demonstrate that our privacy-preserving method for rating prediction can improve the prediction accuracy for distributed datasets. This study opens up new possibilities for privacy-preserving techniques in recommender systems.
RecGPT: Generative Pre-training for Text-based Recommendation
We present the first domain-adapted and fully-trained large language model, RecGPT-7B, and its instruction-following variant, RecGPT-7B-Instruct, for text-based recommendation. Experimental results on rating prediction and sequential recommendation tasks show that our model, RecGPT-7B-Instruct, outperforms previous strong baselines. We are releasing our RecGPT models as well as their pre-training and fine-tuning datasets to facilitate future research and downstream applications in text-based recommendation. Public "huggingface" links to our RecGPT models and datasets are available at: https://github.com/VinAIResearch/RecGPT
All-in-One: Heterogeneous Interaction Modeling for Cold-Start Rating Prediction
Fang, Shuheng, Zhao, Kangfei, Rong, Yu, Li, Zhixun, Yu, Jeffrey Xu
Cold-start rating prediction is a fundamental problem in recommender systems that has been extensively studied. Many methods have been proposed that exploit explicit relations among existing data, such as collaborative filtering, social recommendations and heterogeneous information network, to alleviate the data insufficiency issue for cold-start users and items. However, the explicit relations constructed based on data between different roles may be unreliable and irrelevant, which limits the performance ceiling of the specific recommendation task. Motivated by this, in this paper, we propose a flexible framework dubbed heterogeneous interaction rating network (HIRE). HIRE dose not solely rely on the pre-defined interaction pattern or the manually constructed heterogeneous information network. Instead, we devise a Heterogeneous Interaction Module (HIM) to jointly model the heterogeneous interactions and directly infer the important interactions via the observed data. In the experiments, we evaluate our model under three cold-start settings on three real-world datasets. The experimental results show that HIRE outperforms other baselines by a large margin. Furthermore, we visualize the inferred interactions of HIRE to confirm the contribution of our model.